repeated multi-commodity auction
Online Learning of Optimal Bidding Strategy in Repeated Multi-Commodity Auctions
We study the online learning problem of a bidder who participates in repeated auctions. With the goal of maximizing his T-period payoff, the bidder determines the optimal allocation of his budget among his bids for $K$ goods at each period. As a bidding strategy, we propose a polynomial-time algorithm, inspired by the dynamic programming approach to the knapsack problem. The proposed algorithm, referred to as dynamic programming on discrete set (DPDS), achieves a regret order of $O(\sqrt{T\log{T}})$. By showing that the regret is lower bounded by $\Omega(\sqrt{T})$ for any strategy, we conclude that DPDS is order optimal up to a $\sqrt{\log{T}}$ term. We evaluate the performance of DPDS empirically in the context of virtual trading in wholesale electricity markets by using historical data from the New York market. Empirical results show that DPDS consistently outperforms benchmark heuristic methods that are derived from machine learning and online learning approaches.
Reviews: Online Learning of Optimal Bidding Strategy in Repeated Multi-Commodity Auctions
This paper studies the online learning (stochastic and full-information) problem of bidding in multi commodity first price auctions. The paper introduces a polynomial time algorithm that achieves a regret of \sqrt{T log(T)} that has a near optimal dependence on T. The main challenge that the paper has to deal with is to find a computationally efficient algorithm for computing the best biding strategy given a known distribution.The authors first demonstrate that natural approaches for solving this problem exactly are not computationally efficient (this is not a formal np-hardness proof). Then, they provide a FPTAS for solving the problem using dynamic programming. Once they have a FPTAS for the offline problem, their results hold for the stochastic online setting using existing reductions. I haven't carefully looked in to the details of their analysis of the dynamic programming, but I think the effectiveness of it here is interesting and surprising -- specially given that the variation of this problem for the second price auctions is hard to approximate.
Online Learning of Optimal Bidding Strategy in Repeated Multi-Commodity Auctions
Baltaoglu, M. Sevi, Tong, Lang, Zhao, Qing
We study the online learning problem of a bidder who participates in repeated auctions. With the goal of maximizing his T-period payoff, the bidder determines the optimal allocation of his budget among his bids for $K$ goods at each period. As a bidding strategy, we propose a polynomial-time algorithm, inspired by the dynamic programming approach to the knapsack problem. The proposed algorithm, referred to as dynamic programming on discrete set (DPDS), achieves a regret order of $O(\sqrt{T\log{T}})$. By showing that the regret is lower bounded by $\Omega(\sqrt{T})$ for any strategy, we conclude that DPDS is order optimal up to a $\sqrt{\log{T}}$ term.